
    PCA and K-Means decipher genome

    In this paper, we give a tutorial for undergraduate students studying statistical methods and/or bioinformatics. The students learn how data visualization can help in genomic sequence analysis. They start with a fragment of genetic text from a bacterial genome and analyze its structure. By means of principal component analysis they "discover" that the information in the genome is encoded by non-overlapping triplets. Next, they learn how to find gene positions. This exercise on PCA and K-Means clustering enables active study of basic bioinformatics notions. Appendix 1 contains program listings that accompany the exercise. Appendix 2 includes 2D PCA plots of triplet usage in a moving frame for a series of bacterial genomes, from GC-poor to GC-rich ones. Animated 3D PCA plots are attached as separate gif files. The topology (cluster structure) and geometry (mutual positions of clusters) of these plots depend clearly on GC-content. Comment: 18 pages, with program listings for MATLAB, PCA analysis of genomes and additional animated 3D PCA plots.
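The triplet-usage analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the MATLAB listing from the paper's Appendix 1: the function names, the window width of 300, and the step of 10 are assumptions chosen for the example, and the K-Means step is left out. Each window of the sequence is cut into non-overlapping triplets, the 64 triplet frequencies form one row of a table, and the rows are projected onto the leading principal components.

```python
import itertools
import numpy as np

# All 64 possible triplets over the DNA alphabet, in a fixed order.
TRIPLETS = ["".join(p) for p in itertools.product("ACGT", repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def triplet_frequencies(fragment):
    """Return the 64 non-overlapping triplet frequencies of one fragment."""
    counts = np.zeros(64)
    for i in range(0, len(fragment) - 2, 3):
        idx = INDEX.get(fragment[i:i + 3])
        if idx is not None:  # skip triplets with ambiguous letters such as N
            counts[idx] += 1
    total = counts.sum()
    return counts / total if total else counts


def window_table(sequence, width=300, step=10):
    """Build one frequency row per sliding window (width and step are assumed values)."""
    starts = range(0, len(sequence) - width + 1, step)
    return np.array([triplet_frequencies(sequence[s:s + width]) for s in starts])


def pca_project(table, n_components=2):
    """Project the centered rows onto the first principal components via SVD."""
    centered = table - table.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Plotting the two columns returned by `pca_project(window_table(sequence))` gives the kind of 2D picture the abstract refers to; a clustering method such as K-Means could then be run on the same table.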

    Chiral Polymerization in Open Systems From Chiral-Selective Reaction Rates

    We investigate the possibility that prebiotic homochirality can be achieved exclusively through chiral-selective reaction rate parameters without any other explicit mechanism for chiral bias. Specifically, we examine an open network of polymerization reactions, where the reaction rates can have chiral-selective values. The reactions are not autocatalytic, nor do they contain explicit enantiomeric cross-inhibition terms. We are thus investigating how rare a set of chiral-selective reaction rates needs to be in order to generate a reasonable amount of chiral bias. We quantify our results with a statistical approach: varying both the mean value and the rms dispersion of the relevant reaction rates, we show that moderate to high levels of chiral excess can be achieved with fairly small chiral bias, below 10%. Considering the various unknowns related to prebiotic chemical networks on the early Earth and the dependence of reaction rates on environmental properties such as temperature and pressure variations, we argue that homochirality could have been achieved from moderate amounts of chiral selectivity in the reaction rates. Comment: 15 pages, 6 figures, accepted for publication in Origins of Life and Evolution of Biosphere.

    Hidden Cues in Random Line Stereograms

    Successful fusion of random-line stereograms with breaks in the vernier acuity range has been interpreted to suggest that the interpolation process underlying hyperacuity is parallel and preliminary to stereomatching. In this paper (a) we demonstrate with computer experiments that vernier cues are not needed to solve the stereomatching problem posed by these stereograms and (b) we provide psychophysical evidence that human stereopsis probably does not use vernier cues alone to achieve fusion of these random-line stereograms. MIT Artificial Intelligence Laboratory

    Sequence-modification in copoly(ester-imide)s: a catalytic/supramolecular approach to the evolution and reading of copolymer sequence-information

    Catalytic ester-interchange reactions, analogous to mutation and recombination, allow new sequence-information to be written, statistically, into NDI-based poly(ester-imide) chains. Thus, both insertion of the cyclic ester cyclopentadecanolide ("exaltolide") into an NDI-based homopolymer, and quantitative sequence-exchange between two different homopoly(ester-imide)s, are catalysed by di-n-butyl tin(IV) oxide. Emerging sequences are identified at the triplet and quintet levels using supramolecular complexation of pyrene-d10 at the NDI residues to amplify the separation of 1H NMR resonances associated with different sequences. In such systems, pyrene is able to act as a "reader-molecule" by generating different levels of ring-current shielding from the different patterns of supramolecular binding to all the NDI-centred sequences of a given length.

    Base-Pairing Versatility Determines Wobble Sites in tRNA Anticodons of Vertebrate Mitogenomes

    BACKGROUND: Vertebrate mitochondrial genomes typically have one transfer RNA (tRNA) for each synonymous codon family. This limited anticodon repertoire implies that each tRNA anticodon needs to wobble (establish a non-Watson-Crick base pairing between two nucleotides in RNA molecules) to recognize one or more synonymous codons. Different hypotheses have been proposed to explain the factors that determine the nucleotide composition of wobble sites in vertebrate mitochondrial tRNA anticodons. Until now, the two major postulates (the "codon-anticodon adaptation hypothesis" and the "wobble versatility hypothesis") have not been formally tested in vertebrate mitochondria because both make the same predictions regarding the composition of anticodon wobble sites. The same is true for the more recent "wobble cost hypothesis". PRINCIPAL FINDINGS: In this study we analyzed the occurrence of synonymous codons and tRNA anticodon wobble sites in 1553 complete vertebrate mitochondrial genomes, focusing on three fish species with a reversal of mtDNA codon usage bias (the L-strand is GT-rich). These mitogenomes offer an excellent opportunity to study the evolution of the wobble nucleotide composition of tRNA anticodons because, owing to the reversal, the predictions for the anticodon wobble sites differ between the existing hypotheses. We observed that none of the wobble sites of tRNA anticodons in these unusual mitochondrial genomes coevolved to match the new overall codon usage bias, suggesting that nucleotides at the wobble sites of tRNA anticodons in vertebrate mitochondrial genomes are determined by wobble versatility. CONCLUSIONS/SIGNIFICANCE: Our results suggest that, at wobble sites of tRNA anticodons in vertebrate mitogenomes, selection favors the most versatile nucleotide in terms of wobble base-pairing stability and that wobble site composition is not influenced by codon usage. These results agree with the "wobble versatility hypothesis".

    PCA Beyond The Concept of Manifolds: Principal Trees, Metro Maps, and Elastic Cubic Complexes

    Multidimensional data distributions can have complex topologies and variable local dimensions. To approximate complex data, we propose a new type of low-dimensional "principal object": a principal cubic complex. This complex is a generalization of linear and non-linear principal manifolds and includes them as a particular case. To construct such an object, we combine a method of topological grammars with the minimization of an elastic energy defined for its embedment into multidimensional data space. The whole complex is presented as a system of nodes and springs and as a product of one-dimensional continua (represented by graphs), and the grammars describe how these continua transform during the process of optimal complex construction. The simplest case of a topological grammar ("add a node", "bisect an edge") is equivalent to the construction of "principal trees", an object useful in many practical applications. We demonstrate how it can be applied to the analysis of bacterial genomes and to the visualization of cDNA microarray data using the "metro map" representation. The preprint is supplemented by an animation: "How the topological grammar constructs branching principal components (AnimatedBranchingPCA.gif)". Comment: 19 pages, 8 figures.

    Molecular architecture and activation of the insecticidal protein Vip3Aa from Bacillus thuringiensis

    Bacillus thuringiensis Vip3 (Vegetative Insecticidal Protein 3) toxins are widely used in biotech crops to control Lepidopteran pests. These proteins are produced as inactive protoxins that need to be activated by midgut proteases to trigger cell death. However, little is known about their three-dimensional organization and activation mechanism at the molecular level. Here, we have determined the structures of the protoxin and the protease-activated state of Vip3Aa at 2.9 Å using cryo-electron microscopy. The reconstructions show that the protoxin assembles into a pyramid-shaped tetramer with the C-terminal domains exposed to the solvent and the N-terminal region folded into a spring-loaded apex that, after protease activation, drastically remodels into an extended needle by a mechanism akin to that of influenza haemagglutinin. These results provide the molecular basis for Vip3 activation and function, and serve as a strong foundation for the development of more efficient insecticidal proteins.

    The Mechanisms of Codon Reassignments in Mitochondrial Genetic Codes

    Many cases of non-standard genetic codes are known in mitochondrial genomes. We analyze the phylogeny and codon usage of organisms for which the complete mitochondrial genome is available, and we determine the most likely mechanism for codon reassignment in each case. Reassignment events can be classified according to the gain-loss framework. The gain represents the appearance of a new tRNA for the reassigned codon or the change of an existing tRNA such that it gains the ability to pair with the codon. The loss represents the deletion of a tRNA or the change in a tRNA so that it no longer translates the codon. One possible mechanism is Codon Disappearance, where the codon disappears from the genome prior to the gain and loss events. In the alternative mechanisms the codon does not disappear. In the Unassigned Codon mechanism, the loss occurs first, whereas in the Ambiguous Intermediate mechanism, the gain occurs first. Codon usage analysis gives clear evidence of cases where the codon disappeared at the point of the reassignment and also cases where it did not disappear. Codon disappearance is the probable explanation for stop-to-sense reassignments and a small number of reassignments of sense codons. However, the majority of sense-to-sense reassignments cannot be explained by codon disappearance. In the latter cases, by analysis of the presence or absence of tRNAs in the genome and of the changes in tRNA sequences, it is sometimes possible to distinguish between the Unassigned Codon and Ambiguous Intermediate mechanisms. We emphasize that not all reassignments follow the same scenario and that it is necessary to consider the details of each case carefully. Comment: 53 pages (45 pages, including 4 figures + 8 pages of supplementary information). To appear in J. Mol. Evol.
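The ordering logic of the gain-loss framework in this abstract can be written down as a small decision rule. The sketch below is only an illustration of the three mechanisms as the abstract states them; the function name and the event labels `"gain"`, `"loss"`, and `"disappearance"` are hypothetical, and real cases require the phylogenetic and tRNA evidence the authors describe.

```python
def classify_reassignment(events):
    """Name the mechanism implied by a chronological list of event labels.

    `events` is an ordered list containing "gain" and "loss", and optionally
    "disappearance" (the codon vanishing from the genome). The labels are
    hypothetical names chosen for this sketch.
    """
    order = {name: position for position, name in enumerate(events)}
    if "gain" not in order or "loss" not in order:
        raise ValueError("both a gain and a loss event are required")
    gain, loss = order["gain"], order["loss"]
    vanish = order.get("disappearance")
    # Codon Disappearance: the codon is gone before either the gain or the loss.
    if vanish is not None and vanish < min(gain, loss):
        return "Codon Disappearance"
    # Otherwise the codon stays in use and the order of loss and gain decides.
    return "Unassigned Codon" if loss < gain else "Ambiguous Intermediate"
```

For example, `classify_reassignment(["loss", "gain"])` returns `"Unassigned Codon"`, matching the abstract's statement that the loss occurs first in that mechanism.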

    The concept of RNA-assisted protein folding: the role of tRNA

    We suggest that tRNA actively participates in the transfer of 3D information from mRNA to peptides, in addition to its well-known, "classical" role of translating the three-letter RNA code into the one-letter protein code. The tRNA molecule displays a series of thermodynamically favored configurations during translation, a movement which places the codon and coded amino acids in proximity to each other and makes physical contact between some amino acids and their codons possible. This specific codon-amino acid interaction of some selected amino acids is necessary for the transfer of spatial information from mRNA to coded proteins, and is known as RNA-assisted protein folding.